Novel Alzheimer's disease subtypes based on functional brain connectivity in human connectome project

The pathogenesis of Alzheimer's disease (AD) remains unclear, but revealing individual differences in functional connectivity (FC) may provide insights and improve diagnostic precision. A hierarchical clustering-based autoencoder with functional connectivity was proposed to categorize 82 AD patients from the Alzheimer's Disease Neuroimaging Initiative. Compared with clustering the raw data directly, using an autoencoder to reduce the dimensionality of the matrix can suppress noise and redundant information, extract key features, and improve clustering performance. Subsequently, subtype differences in clinical and graph theoretical metrics were assessed. Results indicate significant inter-subject heterogeneity in the degree of FC disruption among AD patients. We identified two neurophysiological subtypes: subtype I exhibits widespread functional impairment across the entire brain, while subtype II shows mild impairment in the limbic system (LS). Notably, we also observed significant differences between subtypes in the associations of neurocognitive assessment scores with network functionality and in graph theory metrics. Our method can identify distinct functional disruptions in AD subtypes, which may facilitate personalized treatment and early diagnosis and ultimately improve patient outcomes.


Materials and methods
Data collection and sharing for this study were funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI, http://adni.loni.usc.edu). All methods were carried out in accordance with relevant guidelines and regulations. All experimental protocols were approved by the institutional review board (IRB) at Hangzhou Dianzi University (IRB-2020001) and the ethics committee at Beijing Hospital (2022BJYYEC-375-01).
ADNI is managed by various organizations including the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the U.S. Food and Drug Administration (FDA), and the National Institute on Aging (NIA). Its primary objective is to advance knowledge regarding the pathophysiology of AD, develop biomarkers for the disease, enhance diagnostic techniques for early AD detection, and refine clinical trial methodologies.

Subjects
A total of 82 AD patients and 50 control normal (CN) individuals (65 men and 67 women, aged 55 to 90) were involved in the study. Each participant underwent both resting-state fMRI and T1-weighted MRI scans. Additionally, the research team gathered demographic information for each participant, such as age, gender, and education level. Neurological examination scale scores, including the Mini-Mental State Examination (MMSE) and the Rey Auditory Verbal Learning Test (RAVLT), were also obtained.

Clustering analysis
An N × N matrix is generated by calculating the correlation between each pair of brain regions, where N represents the total number of brain regions. Because the correlation matrix is symmetric, typically only its upper triangular part is used as FC data in subsequent analyses. This triangular part is converted into a vector. Hierarchical clustering is a widely employed clustering method that requires calculating the distance or similarity between each pair of samples. However, as the data dimension increases, distance calculations become more complex and time-consuming. Additionally, in high-dimensional spaces data tend to become sparse, rendering distance calculations less reliable and leading to less accurate outcomes.
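The vectorization step described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function name `fc_upper_triangle`, the use of Pearson correlation via `np.corrcoef`, the exclusion of the diagonal, and the synthetic time series are all assumptions made for the example.

```python
import numpy as np

def fc_upper_triangle(timeseries):
    """Vectorize the upper triangle of a region-by-region correlation matrix.

    timeseries: array of shape (T, N) with T time points for N brain regions.
    """
    fc = np.corrcoef(timeseries, rowvar=False)   # N x N symmetric matrix
    rows, cols = np.triu_indices(fc.shape[0], k=1)  # assumed: diagonal excluded
    return fc[rows, cols]                        # length N * (N - 1) / 2

# Synthetic stand-in for one subject's regional time series (not real data).
rng = np.random.default_rng(0)
toy_timeseries = rng.standard_normal((120, 360))
fc_vector = fc_upper_triangle(toy_timeseries)
```

With N = 360 regions, each subject is represented by a vector of 360 × 359 / 2 = 64,620 entries, which is the high-dimensional input that motivates dimensionality reduction before clustering.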
Deep learning has gained significant popularity in brain functional connectivity and network analysis 20,21. It enables the exploration of connection patterns and network structures among brain regions, thereby providing insights into brain function and cognitive processes. The autoencoder 22 is an unsupervised technique commonly used for dimensionality compression and feature representation. By compressing input data into a lower-dimensional latent space and then reconstructing the original data, autoencoders effectively reduce dimensionality. They can also learn nonlinear feature representations that capture higher-order and more complex structures and patterns in the data. Before applying hierarchical clustering 23, we use an autoencoder to reduce the dimensionality of the data and improve the clustering result. The autoencoder employs a reconstruction loss function, which encourages the model to learn the most informative features of the input data during encoding 24. By minimizing the reconstruction error, the autoencoder retains the features that best capture the underlying patterns in the data. Subsequently, hierarchical clustering is conducted, initially treating each subject as an individual cluster. Through the iterative merging of clusters using the Ward linkage method, a cluster hierarchy is gradually formed.
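The two-stage procedure above (compress, then cluster with Ward linkage) can be sketched as follows. This is only an illustration under stated assumptions: the source does not give the autoencoder architecture, so a one-hidden-layer network built from scikit-learn's `MLPRegressor` trained to reproduce its own input is swapped in here, and the function name `encode_and_cluster`, the latent size of 8, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.neural_network import MLPRegressor

def encode_and_cluster(fc_vectors, latent_dim=8, n_clusters=2, seed=0):
    """Compress FC vectors with a small autoencoder, then Ward-cluster the codes."""
    # Assumed stand-in autoencoder: the target equals the input, so the
    # squared-error loss being minimized is the reconstruction error.
    autoencoder = MLPRegressor(hidden_layer_sizes=(latent_dim,), activation="tanh",
                               max_iter=500, random_state=seed)
    autoencoder.fit(fc_vectors, fc_vectors)
    # The hidden-layer activations serve as the low-dimensional codes.
    codes = np.tanh(fc_vectors @ autoencoder.coefs_[0] + autoencoder.intercepts_[0])
    # Agglomerative clustering starts from singletons and merges with Ward linkage.
    labels = AgglomerativeClustering(n_clusters=n_clusters, linkage="ward").fit_predict(codes)
    return codes, labels

# Synthetic stand-in for 82 subjects with 200 FC features each (not real data).
rng = np.random.default_rng(0)
toy = np.vstack([rng.normal(-0.3, 0.1, (41, 200)), rng.normal(0.3, 0.1, (41, 200))])
codes, labels = encode_and_cluster(toy)
```

Clustering the codes rather than the raw vectors means distances are computed in the latent space, which is where the reduced dimensionality is expected to help.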
To assess the robustness of the results across different numbers of clusters, we repeated the analysis for k ranging from 2 to 9. The Silhouette Coefficient 25 considers both the average distance between data points within a cluster and the average distance to data points in the nearest neighboring cluster. It ranges from -1 to 1, with higher values indicating better separation. The Davies-Bouldin Score 26, in contrast, evaluates cluster separation by considering the average similarity between each cluster and its most similar neighboring cluster. A lower Davies-Bouldin Score indicates better clustering performance. The optimal number of clusters (k = 2), as determined by the joint evaluation of the Silhouette Coefficient and the Davies-Bouldin Score, is depicted in Fig. 2.
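The model-selection loop above can be sketched with scikit-learn's `silhouette_score` and `davies_bouldin_score`. The function name `score_cluster_counts` and the synthetic features are assumptions for illustration. The source does not say how the two scores were combined, so the example only reports both and picks the k with the highest silhouette.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import davies_bouldin_score, silhouette_score

def score_cluster_counts(features, k_values=range(2, 10)):
    """Return {k: (silhouette, davies_bouldin)} for Ward clustering at each k."""
    scores = {}
    for k in k_values:
        labels = AgglomerativeClustering(n_clusters=k, linkage="ward").fit_predict(features)
        scores[k] = (silhouette_score(features, labels),      # higher is better
                     davies_bouldin_score(features, labels))  # lower is better
    return scores

# Synthetic stand-in for latent features: two well-separated groups (not real data).
rng = np.random.default_rng(0)
toy = np.vstack([rng.normal(0.0, 1.0, (40, 5)), rng.normal(8.0, 1.0, (40, 5))])
scores = score_cluster_counts(toy)
best_k = max(scores, key=lambda k: scores[k][0])
```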

Intra-network functional connectivity
For each participant, we ordered the 360 brain regions according to the Yeo 7 functional networks 27, assigning each brain region to the network with which it had the maximum overlap (the maximum overlap rule 28). Functional network connectivity for each network was calculated to assess the overall integrity of functional network modules. The intra-network FC 29 is expressed by Eq. (1), where n_X represents the number of brain regions contained in network module X and r_ij represents the correlation coefficient between region i and region j. Finally, we examined the correlations between age, education, neuroassessment scale scores, and functional modules.
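The intra-network FC described above can be sketched as follows, assuming it is the mean correlation over distinct region pairs within a network, that is, 2 / (n_X (n_X − 1)) times the sum of r_ij over i < j. This normalization is an assumption and may differ from Eq. (1). The function name `intra_network_fc` and the toy matrix are hypothetical.

```python
import numpy as np

def intra_network_fc(fc, network_labels):
    """Mean pairwise correlation among the regions of each network.

    fc: (N, N) symmetric correlation matrix.
    network_labels: length-N sequence giving each region's network name.
    """
    network_labels = np.asarray(network_labels)
    result = {}
    for name in np.unique(network_labels):
        idx = np.flatnonzero(network_labels == name)
        if idx.size < 2:
            continue  # a single region has no pairs to average
        block = fc[np.ix_(idx, idx)]
        rows, cols = np.triu_indices(idx.size, k=1)  # each pair (i, j), i < j, once
        result[str(name)] = float(block[rows, cols].mean())
    return result

# Toy 4-region example: regions 0-2 form network "A", region 3 is alone in "B".
toy_fc = np.array([[1.0, 0.2, 0.4, 0.0],
                   [0.2, 1.0, 0.6, 0.0],
                   [0.4, 0.6, 1.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0]])
toy_result = intra_network_fc(toy_fc, ["A", "A", "A", "B"])
```

In the toy example, network "A" averages the three pairs 0.2, 0.4 and 0.6, and network "B" is skipped because it contains only one region.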

Graph theory analysis
False or weak connections may contribute little to neural pathways and can introduce noise that affects computational results. Hence, it is necessary to binarize the previously obtained functional connections before computing graph theory parameters. To identify the threshold that maximizes global cost-effectiveness (GCE), the matrix is sparsified using the strongest-weight proportional thresholding method. The aim of this sparsification is to mitigate the influence of spurious connections by filtering out those with weaker weights. The optimal proportion of strong weights (PSW) determines the threshold that maximizes GCE while excluding connections with lower weights. By retaining genuine neural pathway connections and reducing false connections caused by noise, we can improve the reliability and accuracy of the analysis. The mathematical expression for this process is as follows, where E represents the global benefit, represents the local benefit of the first node in the brain network, N is the set of brain nodes, and d_ij represents the shortest connected path between node i and node j. This paper uses an exhaustive search to determine the PSW value. We set the search range from 0 to 100%, with a step size of 1%, and record the GCE value for each PSW. Finally, we take the PSW value with the maximum GCE as the threshold used for sparsification.
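The exhaustive PSW search can be sketched as follows. Several points are assumptions rather than details confirmed by the text: GCE is taken to be global efficiency minus the retained proportion of connections (a common cost-efficiency definition), global efficiency is taken to be the mean inverse shortest-path length over node pairs, and connections are ranked by signed weight. The function names and the random matrix are hypothetical.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def threshold_by_psw(fc, psw):
    """Keep the strongest `psw` fraction of connections and binarize them."""
    n = fc.shape[0]
    rows, cols = np.triu_indices(n, k=1)
    weights = fc[rows, cols]
    n_keep = int(round(psw * weights.size))
    strongest = np.argsort(weights)[::-1][:n_keep]  # assumed: signed-weight ranking
    adj = np.zeros((n, n), dtype=float)
    adj[rows[strongest], cols[strongest]] = 1.0
    return adj + adj.T  # symmetric binary adjacency matrix

def global_efficiency(adj):
    """Mean inverse shortest-path length over all ordered pairs of distinct nodes."""
    n = adj.shape[0]
    dist = shortest_path(adj, method="D", directed=False, unweighted=True)
    off_diagonal = ~np.eye(n, dtype=bool)
    # Unreachable pairs have infinite distance and therefore contribute zero.
    return float((1.0 / dist[off_diagonal]).sum() / (n * (n - 1)))

def search_psw(fc):
    """Scan PSW from 0% to 100% in 1% steps and return the GCE-maximizing value."""
    best_psw, best_gce = 0.0, -np.inf
    for percent in range(0, 101):
        psw = percent / 100.0
        gce = global_efficiency(threshold_by_psw(fc, psw)) - psw  # assumed GCE form
        if gce > best_gce:
            best_psw, best_gce = psw, gce
    return best_psw, best_gce

# Synthetic symmetric matrix standing in for one subject's FC (not real data).
rng = np.random.default_rng(0)
m = rng.uniform(-1.0, 1.0, (30, 30))
toy_fc = (m + m.T) / 2.0
np.fill_diagonal(toy_fc, 1.0)
best_psw, best_gce = search_psw(toy_fc)
```

Under this assumed definition, GCE is zero at both ends of the range (an empty graph has zero efficiency, a complete graph has efficiency 1 at cost 1), so the maximum falls at an intermediate density.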
To gain a more precise understanding of the functional characteristics of the network and to capture the relationship between brain structural properties and functional activity, we computed graph-theoretic metrics on the thresholded matrix. Specifically, we used the Brain Connectivity Toolbox 30 (BCT, available at https://sites.google.com/site/bctnet/) to calculate several indicators, including clustering coefficient, k-coreness, local efficiency and strength. These metrics provide insight into the network's organization and efficiency. Individual differences may increase the variance of the data, thereby masking true group differences and complicating differential analysis. Individual normalization removes scale and amplitude differences between individuals, making their data more comparable. Therefore, we performed individual normalization for each subject. To assess differences in functional connectivity matrices between subtypes, general linear model (GLM) analyses were performed in a univariate manner, with age and sex as covariates. This approach enabled us to examine the specific effects of subtype while controlling for potential confounding factors. Finally, we applied a false discovery rate (FDR) correction to account for multiple comparisons. Overall, by combining graph-theoretic analysis, GLM modeling, and appropriate statistical correction, we aimed to provide a reliable assessment of the functional connectivity differences between subtypes, accounting for relevant demographic factors and mitigating the risk of spurious associations.
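The univariate GLM with covariates followed by FDR correction can be sketched as follows. This is a minimal illustration under assumptions: the model is an ordinary least squares fit of each feature on an intercept, a subtype indicator, age and sex, and the FDR step uses the Benjamini-Hochberg procedure, which the text does not name. The function names `glm_subtype_effect` and `fdr_bh` and the synthetic data are hypothetical.

```python
import numpy as np
from scipy import stats

def glm_subtype_effect(y, subtype, age, sex):
    """Fit y ~ intercept + subtype + age + sex; return (t, p) for the subtype term."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(y.size), subtype, age, sex]).astype(float)
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = residuals @ residuals / dof            # residual variance
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)      # covariance of the estimates
    t_value = beta[1] / np.sqrt(cov_beta[1, 1])
    p_value = 2.0 * stats.t.sf(abs(t_value), dof)   # two-sided test
    return float(t_value), float(p_value)

def fdr_bh(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    monotone = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(monotone, 0.0, 1.0)
    return adjusted

# Synthetic data (not real): one feature with a subtype effect, one without.
rng = np.random.default_rng(0)
n = 60
subtype = np.repeat([0.0, 1.0], n // 2)
age = rng.uniform(55, 90, n)
sex = rng.integers(0, 2, n).astype(float)
y_effect = 2.0 * subtype + rng.normal(0.0, 0.5, n)
y_null = rng.normal(0.0, 0.5, n)
pvals = [glm_subtype_effect(y, subtype, age, sex)[1] for y in (y_effect, y_null)]
adjusted = fdr_bh(pvals)
```

In practice the same fit would be repeated for every connection or node-level metric, and the correction applied across the full set of resulting p-values.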

Ethical approval
This study was approved by the institutional review board (IRB) at Hangzhou Dianzi University (IRB-2020001), and the ethics committee at Beijing Hospital (2022BJYYEC-375-01).

Consent to participate
Patient consent was waived due to the anonymization of all sensitive information in the collected data.

FC-based subtypes of AD
Through multiple repeated experiments, we ensured the stability and reliability of the clustering and identified two subtypes. Subsequently, we examined the demographic characteristics of the two subtypes and the control group, which comprised 60 participants (73.17%) in subtype I, 22 participants (26.83%) in subtype II and 50 healthy participants. Most AD patients were therefore classified as subtype I. The mean age of subtype I was 73.36 years, while subtype II had a slightly higher mean age of 74.22 years. Gender distribution was balanced across the three groups. We employed one-way ANOVA, the Chi-squared test and the Kruskal-Wallis test to examine whether the groups differed in age, gender, and cognitive scale scores. The results indicate that there were no significant differences among the three groups in terms of age and gender (both p > 0.05), but there were significant differences in educational level and cognitive scale scores. Table 1 provides an overview of the participants' demographic and clinical characteristics.
Correlation matrices can provide valuable insights into the functional connectivity between different brain regions. Our research revealed significant heterogeneity in functional connectivity among participants with AD. Although all of them exhibit pronounced memory deficits, subtype I displays more severe functional disruption across the entire brain, as illustrated in Fig. 3. With increasing age, the brains of CN individuals exhibit varying degrees of atrophy, accompanied by some decline in cognitive abilities. The functional connectivity of subtype II remains largely preserved, resembling that of the CN group and in places even exceeding it. This variation may arise from genetic factors, lifestyle choices, and other cardiovascular and cerebrovascular conditions. Consequently, we have labeled subtype I patients as the "malignant subtype" and subtype II patients as the "benign subtype". The median FC matrices exhibited uniform abnormalities in both the LS and its associated resting-state networks (RSN) within the two subtypes. This leads us to posit that the focal point of the disease may reside within the LS network, highlighting its potential as a critical contributor to the cognitive decline observed in individuals with AD.
As shown in Fig. 4, subtype I exhibits significant differences from the control group across all seven brain networks (p < 0.001), with its median functional connectivity values being lower than those of both CN and subtype II. This indicates that subtype I has functional connectivity abnormalities across multiple brain networks, which differ significantly from normal aging and are associated with the development of AD. In contrast, subtype II showed less pronounced differences from the control group in the five functional networks from the Sensorimotor Network (SN) to the Frontoparietal Network (FN). However, the differences were significant (p < 0.05) in the Visual Network (VN) and Default Mode Network (DMN), with subtype II's median functional connectivity higher than that of CN. The stronger connectivity in the correlation matrices of the subtype II group may suggest a higher network density. In the early stages of AD, the brain may compensate for neurodegenerative changes by enhancing the connectivity of certain functional networks. This compensatory mechanism may help maintain certain cognitive functions, even if the brain has undergone atrophy. These findings partially support previous conclusions from FC analyses suggesting that FC in AD may be enhanced 31,32.

Association between FC and demographic characteristics
We used linear regression to examine the correlation between patients' demographic characteristics and network-specific FC. Within the malignant subtype, we observed a significant negative correlation between the intra-network FC of the SN and age (r = −0.272, p = 0.036). However, no significant correlation was observed between age and intra-network FC in the benign subtype. Meanwhile, the intra-network FC of the seven functional networks in CN showed a similar significant negative correlation with age. Taken together, these results suggest that FC within specific networks tends to decrease with age, and that this effect differs between the two subtypes.
Subsequently, we examined the relationship between intra-network FC and educational level. A positive correlation between intra-network FC and educational level was observed in the seven functional networks of benign patients. However, this trend did not reach statistical significance. Further research with a larger sample may be needed to establish a conclusive relationship between intra-network FC and educational level in this context. In CN, on the other hand, we found a negative correlation (r = −0.288, p = 0.043) between the intra-network FC of the SN and educational level. Figure 5 plots the correlation coefficients between the intra-network FC values and descriptive characteristics.

Association between FC and cognitive scores
The MMSE is commonly used to assess an individual's intellectual state and cognitive abilities, while the RAVLT is primarily employed to evaluate learning and memory, particularly the retention of verbal material. In this section, we explore the relationship between MMSE and RAVLT scores and intra-network FC. As shown in Fig. 6, in the malignant subtype the intra-network FC of the LS decreased as the RAVLT_learning score increased (r = −0.289, p = 0.025), which is contrary to our expectation. Additionally, the intra-network FC of the LS module was negatively correlated with RAVLT_forgetting scores (r = −0.294, p = 0.022). However, no significant correlation was observed between intra-network FC and MMSE or RAVLT scores in either the benign subtype or CN.

Graph theory analysis
In graph theory, local indicators such as clustering coefficient, k-coreness, and local efficiency are analyzed individually for each brain region, providing insights into the degree of modularity, information transmission capacity, and fault tolerance within relatively independent regions of the brain network 33. Through GLM analysis of graph theory metrics, we identified several significant regions in the four graph theory metrics, primarily located within the LS and DMN, such as the OFC 34,35 and TE1m 12 brain regions (see Appendix Table A1 for HCPMMP atlas information). In terms of k-coreness and strength, some nodes within the FN also exhibit significant differences. Furthermore, many VN nodes are also significant in terms of strength. These significant nodes are illustrated in Fig. 7. We also compared each subtype with the control group. Figure 8 displays the brain regions that differ significantly between the control group and subtype I, whereas subtype II has no significant nodes compared to CN. Table 2 lists the nodes with significant differences among the four metrics.

Discussion
Existing heterogeneity studies have predominantly relied on sMRI or PET imaging 37. This study, using fMRI, identified two subtypes of AD, namely the "malignant subtype" and the "benign subtype". These two AD subtypes exhibit significant differences in FC, suggesting that AD is not a single disease entity and that specific subtypes may be more responsive to certain drugs or treatment approaches.
First, we noticed that the malignant subtype exhibits extensive functional connectivity loss, while the benign subtype shows only mild impairments within the LS and its associated RSNs. We hypothesize that compensatory mechanisms may account for the better-preserved FC in the benign subtype, while the LS may be the primary affected area in the disease. The first area of the brain to be impacted in early AD (benign subtype) is the DMN, which is also the main reason for memory loss. This damage gradually affects additional networks as the illness worsens, turning it into the malignant subtype. The LS, which includes key structures such as the hippocampus, cingulate gyrus, and amygdala, plays a crucial role in regulating the generation, expression, and control of emotions, as well as in the formation and storage of memories 38,39. Early-stage AD is typically accompanied by neuronal loss and inflammatory responses. Changes in the LS detected through biomarker testing could serve as an early indicator of AD, aiding early diagnosis and intervention. Previous studies have indicated that the disruption of structural network topology in MCI and AD patients primarily occurs in regions within the LS 40. Enhancing connectivity within the LS may help maintain memory function, possibly serving as a compensatory mechanism 41. This aligns with our findings. Second, the findings show a significant negative correlation between intra-network FC of the SN and age in the malignant subtype. These findings suggest that in both malignant and benign subtypes, the internal connectivity of brain networks may be influenced by age, and that this influence may differ between the two subtypes. Subtype II exhibits higher FC and a positive, though non-significant, correlation after a specific age, which may still reflect the initiation of certain compensatory mechanisms. Age is a complex and important variable in functional connectivity. During aging, these compensatory mechanisms may enhance functional connectivity to maintain cognitive function or other brain activities. This finding provides potential clues for personalized therapy. For instance, treatment for subtype I patients may require more attention to the stability of neural networks, whereas therapy for subtype II patients may involve strategies to support and enhance brain compensatory mechanisms.
Third, the RAVLT_forgetting score measures the short-term memory capacity of the subjects, with lower scores indicating that subjects are better at retaining and recalling learned information in delayed recall tasks. We identified a negative correlation between the LS module and RAVLT_forgetting scores. We also uncovered an intriguing finding: the intra-network FC of the LS in the malignant subtype decreases as the RAVLT_learning score increases, which is inconsistent with our initial expectations 42. We speculate that patients with the malignant subtype of AD may exhibit poorer adaptability in learning and memory, requiring more cognitive effort to remember these words. In other words, when the FC value has already fallen below a certain threshold, additional cognitive burden may render their brains unable to maintain normal functional connections, resulting in a reduction in intra-network FC in the default mode network. This is only a hypothesis; the actual reasons may be more complex and warrant further research.
Finally, there are significant differences in graph-theoretical metrics. The DMN is closely associated with memory retrieval and mind-wandering, primarily involving brain regions such as the frontal lobe, temporal lobe, and posterior cingulate cortex 43. Compared to the benign subtype and CN, the malignant subtype shows significant differences in clustering coefficient, k-coreness, local efficiency, and node strength in many nodes within the DMN and LS. These regions are involved in normal information transmission and processing. Additionally, significant nodes are also present in the FN and VN, which are responsible for advanced cognitive and visual information processing. Impairment of their functions leads to difficulties in decision-making and visual processing in patients. In summary, the significant differences in these graph theory metrics reflect pronounced distinctions in brain network structure and function between the malignant and benign subtypes.

Figure 5. Correlation analysis between intra-network FC values and demographic characteristics. All analyses were conducted using Pearson correlation analysis. The p values indicate the level of significance for the observed correlations.

Figure 6. Correlation analysis between intra-network FC values and RAVLT score. All analyses were conducted using Pearson correlation analysis. The p values indicate the level of significance for the observed correlations.

Figure 8. Significant nodes in the HCPMMP atlas between subtype I and CN, identified using graph theory analysis (N is the number of ROIs).

Table 1. Demographic and clinical characteristics of AD subtypes. Data are presented as either a number or the mean (SD). a One-way ANOVA; b Chi-squared test; c Kruskal-Wallis test.